Concurrent Reinforcement Learning from Customer Interactions
Abstract
In this paper, we explore applications in which a company interacts concurrently with many customers. The company has an objective function, such as maximising revenue, customer satisfaction, or customer loyalty, which depends primarily on the sequence of interactions between company and customer. A key aspect of this setting is that interactions with different customers occur in parallel. As a result, it is imperative to learn online from partial interaction sequences, so that information acquired from one customer is efficiently assimilated and applied in subsequent interactions with other customers. We present the first framework for concurrent reinforcement learning, using a variant of temporal-difference learning to learn efficiently from partial interaction sequences. We evaluate our algorithms in two large-scale test-beds, for online and email interaction respectively, generated from a database of 300,000 customer records.
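To make the idea of learning from partial interaction sequences concrete, here is a minimal sketch. It is not the paper's algorithm: it uses plain tabular TD(0) rather than the variant the abstract mentions, and the toy environment (states 0 to 3, with state 3 standing for a completed purchase), the transition probability, and all parameter names are assumptions made for illustration. What it shows is that a single value table is updated after every individual interaction step, so what is learned from one customer's unfinished sequence is immediately available to all others.

```python
import random
from collections import defaultdict


def concurrent_td0(num_customers=50, steps=200, alpha=0.1, gamma=0.9, seed=0):
    """Tabular TD(0) with one value table shared by all concurrent customers.

    Assumed toy dynamics (hypothetical, not from the paper): each customer
    starts in state 0, advances one state with probability 0.5 per
    interaction, and yields reward 1.0 on reaching terminal state 3.
    """
    rng = random.Random(seed)
    values = defaultdict(float)      # shared estimate V(state)
    states = [0] * num_customers     # current state of every active customer

    for _ in range(steps):
        # Interleave customers: one interaction step each, in parallel.
        for c in range(num_customers):
            s = states[c]
            s_next = s + 1 if rng.random() < 0.5 else s
            terminal = s_next == 3
            reward = 1.0 if terminal else 0.0

            # TD(0) update from a single step; the sequence need not be complete.
            bootstrap = 0.0 if terminal else gamma * values[s_next]
            values[s] += alpha * (reward + bootstrap - values[s])

            # A finished customer is replaced by a new one starting in state 0.
            states[c] = 0 if terminal else s_next

    return dict(values)


if __name__ == "__main__":
    for state, value in sorted(concurrent_td0().items()):
        print(f"V({state}) = {value:.3f}")
```

Because updates bootstrap from the shared table instead of waiting for a sequence to end, states closer to the rewarding outcome should end up with higher estimated values in this toy setting.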
Similar resources
A multiagent architecture for concurrent reinforcement learning
In this paper we propose a multiagent architecture for implementing concurrent reinforcement learning, an approach where several agents, sharing the same environment, perceptions and actions, work towards a single objective: learning a single value function. We present encouraging experimental results derived from the initial phase of our research on the combination of concurrent reinforcement ...
Concurrent Hierarchical Reinforcement Learning for RoboCup Keepaway
RoboCup Keepaway, which originated from the RoboCup soccer simulation 2D challenge, has been widely used as a machine learning benchmark. In this paper, we present a concurrent hierarchical reinforcement learning approach to RoboCup Keepaway. Following the idea of hierarchies of abstract machines (HAMs), we write a partial policy as a HAM from the perspective of a single keeper, run multiple instance...
Discover Relevant Environment Feature Using Concurrent Reinforcement Learning
In order to compare the policies more efficiently, we introduce a new reinforcement learning method called concurrent biased learning. This is a multi-thread learning method, in which each learning thread refers to one feature of the environment. If an agent intentionally focuses on part of these environmental features to learn a policy of a task, we call this method a biased learning; otherwis...
Resolving Conflicts Among Actions in Concurrent Behaviors: Learning to Coordinate
A robotic agent must coordinate its coupled concurrent behaviors to produce a coherent response to stimuli. Reinforcement learning has been used extensively in coordinating sensing to acting of a single behavior and it has been shown useful in loosely coupled concurrent behaviors. We present a technique for applying Q values developed in learning individual behaviors for coordination among coup...
Deep Reinforcement Learning Solutions for Energy Microgrids Management
This paper addresses the problem of efficiently operating the storage devices in an electricity microgrid featuring photovoltaic (PV) panels with both short- and long-term storage capacities. The problem of optimally activating the storage devices is formulated as a sequential decision-making problem under uncertainty where, at every time-step, the uncertainty comes from the lack of knowledge abo...